An  Optimal  Basis  Identification  Technique  for 
Interior- Point  Linear  Programming  Algorithms 

R.A.  Tapia 
and 

Yin  Zhang 
April,  1989 

(Revised  September,  1990) 


TR89-1 


Report  Documentation  Page 

Form  Approved 

OMB  No.  0704-0188 

Public  reporting  burden  for  the  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions,  searching  existing  data  sources,  gathering  and 
maintaining  the  data  needed,  and  completing  and  reviewing  the  collection  of  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of  information, 
including  suggestions  for  reducing  this  burden,  to  Washington  Headquarters  Services,  Directorate  for  Information  Operations  and  Reports,  1215  Jefferson  Davis  Highway,  Suite  1204,  Arlington 

VA  22202-4302.  Respondents  should  be  aware  that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  a  penalty  for  failing  to  comply  with  a  collection  of  information  if  it 
does  not  display  a  currently  valid  OMB  control  number. 

1.  REPORT  DATE 

SEp1990  2.  REPORT  TYPE 

3.  DATES  COVERED 

00-00-1990  to  00-00-1990 

4.  TITLE  AND  SUBTITLE 

An  Optimal  Basis  Identificaiton  Technique  for  Interior-Point  Linear 
Programming  Algorithms 

5a.  CONTRACT  NUMBER 

5b.  GRANT  NUMBER 

5c.  PROGRAM  ELEMENT  NUMBER 

6.  AUTHOR(S) 

5d.  PROJECT  NUMBER 

5e.  TASK  NUMBER 

5f.  WORK  UNIT  NUMBER 

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Computational  and  Applied  Mathematics  Department  ,Rice 

University, 6100  Main  Street  MS  134, Houston, TX, 77005-1892 

8.  PERFORMING  ORGANIZATION 

REPORT  NUMBER 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

10.  SPONSOR/MONITOR'S  ACRONYM(S) 

11.  SPONSOR/MONITOR'S  REPORT 
NUMBER(S) 

12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

Approved  for  public  release;  distribution  unlimited 

13.  SUPPLEMENTARY  NOTES 

14.  ABSTRACT 

15.  SUBJECT  TERMS 

16.  SECURITY  CLASSIFICATION  OF:  17.  LIMITATION  OF 

18.  NUMBER  19a.  NAME  OF 

a.  REPORT  b.  ABSTRACT  c.  THIS  PAGE 

unclassified  unclassified  unclassified 

23 

Standard  Form  298  (Rev.  8-98) 

Prescribed  by  ANSI  Std  Z39-18 


An  Optimal  Basis  Identification  Technique 
for  Interior-Point  Linear  Programming  Algorithms 
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Abstract 

This  work  concerns  a  method  for  identifying  an  optimal  basis  for  linear  program¬ 
ming  problems  in  the  setting  of  interior  point  methods.  To  each  iterate  xk  generated 
by  a  primal  interior  point  algorithm,  say,  we  associate  an  indicator  vector  qk  with 
the  property  that  if  xk  converges  to  a  nondegenerate  vertex  x* ,  then  qk  converges  to 
the  0-1  vector  sign(x*).  More  interestingly,  we  show  that  the  convergence  of  qk  is 
quadratically  faster  than  that  of  xk  in  the  sense  that  || qk  —  q*||  =  0(\\xk  —  a; * 1 1 2 ) .  This 
clear-cut  separation  and  rapid  convergence  allow  one  to  infer  at  an  intermediate  stage 
of  the  iterative  process  which  variables  will  be  zero  at  optimality  and  which  will  not. 
We  also  show  that  under  suitable  assumptions  this  method  is  applicable  to  dual  as 
well  as  primal-dual  algorithms  and  can  be  extended  to  handle  certain  types  of  degen¬ 
eracy.  Numerical  examples  are  included  to  corroborate  the  convergence  properties  of 
the  indicators.  The  practical  limitations  of  the  indicator  technique  are  also  discussed. 
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1  Introduction 


This  paper  concerns  linear  programs  in  the  standard  form: 

minimize  cTx, 

subject  to  Ax  =  b,  (1.1) 

x  >  0, 

where  c,  x  £  R”,  b  £  Rm,  A  £  RmXn(m  <  n)  and  A  has  full  rank  m.  The  dual  linear 
program  of  (1.1)  is 

maximize  bTy , 

subject  to  ATy  4-  z  =  c,  (1.2) 

*  >  0, 

where  z  £  Rn  is  the  vector  of  dual  slack  variables. 

The  simplex  method  for  linear  programming  can  be  viewed  as  an  active  set  method  that 
utilizes  the  combinatorial  structure  of  linear  programs  and  has  an  exponential  worst-case 
complexity.  On  the  other  hand,  interior-point  methods  such  as  the  ellipsoid  algorithm  and 
the  Karmarkar  algorithm  do  not  rely  on  the  combinatorial  structure  and  possess  polynomial 
complexity.  Recent  developments  have  demonstrated  that  interior-point  algorithms  have  the 
real  potential  to  be  competitive  in  practice  with  the  simplex  method. 

Theoretically,  with  integer  data  an  interior-point  algorithm  can  be  terminated  when  the 
current  iterate  is  sufficiently  close  to  an  optimal  solution  and  then  is  rounded  to  the  nearby 
optimal  solution.  However,  theoretical  termination  criteria  of  this  kind  are  difficult  to  define 
and  are  usually  inefficient. 

A  promising  approach  for  improving  the  efficiency  of  Karmarkar-type  interior  point  algo¬ 
rithms  seems  to  be  the  development  of  reliable  techniques  for  identifying  optimal  basic  and 
nonbasic  variables  in  the  early  stages  of  an  interior  point  iterative  process.  In  this  way  either 
an  early  termination  or  a  reduction  in  problem  size  can  be  obtained.  In  other  words,  the 
efficiency  of  interior  point  methods  may  be  improved  by  utilizing  the  combinatorial  structure 
of  linear  programs. 

Suppose  that  an  interior  point  method  is  generating  a  sequence  {xk}  that  is  converging 
to  an  optimal  solution  of  the  linear  program  (1.1).  For  simplicity,  let  us  assume  that  x* 
is  a  nondegenerate  basic  feasible  solution.  At  the  k-th.  iteration,  for  example,  if  one  can 
partition,  using  some  identification  technique,  the  current  iterate  xk  into  a  set  of  m  likely 
basic  variables  and  a  set  of  n  —  m  likely  nonbasic  variables  with  a  reasonable  certainty,  then 
one  may  want  to  set  the  nonbasic  variable  candidates  to  zero  in  the  constraint  equations 
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Ax  =  b  and  solve  the  resulting  square  system  for  the  basic  variable  candidates.  If  the 
solution  obtained  in  this  manner  is  indeed  a  basic  feasible  solution  and  the  corresponding 
reduced  costs  are  all  nonnegative,  then  the  optimal  solution  has  been  obtained  and  the 
algorithm  can  be  terminated.  Otherwise,  one  proceeds  with  the  interior  point  algorithm  to 
the  next  iteration.  In  a  procedure  of  this  kind,  the  identification  technique  plays  the  central 
role.  In  order  to  successfully  terminate  the  interior  point  algorithm  as  early  as  possible,  the 
identification  technique  must  be  reliable,  inexpensive,  and  most  importantly,  able  to  identify 
an  optimal  basis  in  an  early  stage  of  the  iterative  process. 

In  the  presence  of  degeneracies,  the  situation  becomes  more  complicated.  For  example, 
it  is  no  longer  a  straightforward  matter  to  determine  an  optimal  basis  or  even  to  check  the 
optimality  of  a  basic  feasible  solution  in  the  case  of  primal  degeneracy  (i.e.,  when  there  are 
more  than  n  —  m  zero  variables)  even  after  the  zero  and  nonzero  variables  have  been  correctly 
identified.  Nevertheless,  any  information  telling  us  which  variables  are  zero  and  which  are 
nonzero  at  optimality  is  still  of  some  value  and  may  be  used  to  improve  the  efficiency  of 
interior  point  methods.  For  instance,  once  it  has  been  determined  that  some  variables 
are  zero  at  optimality,  they  may  be  eliminated  from  the  problem,  yielding  a  reduction  in 
the  problem  size.  Also  inequality  constraints  may  be  removed  if  their  corresponding  slack 
variables  are  identified  as  nonzero  at  optimality.  For  large  scale  problems,  these  reductions 
in  problem  size  may  result  in  savings  in  computational  effort. 

Working  primarily  with  the  Karmarkar  algorithm  or  one  of  its  variants  the  optimal  basis 
identification  problem  has  been  considered  in  recent  years  by  several  authors,  including 
Kojima  [9],  Ye  and  Todd  [19],  Asic  et  al.  [2],  Todd  [13],  Ye  [17]  and  Gay  [6]. 

In  this  paper,  we  will  propose  and  study  a  new  identification  technique  using  an  indi¬ 
cator  to  identify  the  optimal  basis.  This  indicator  will  be  shown  to  possess  several  elegant 
mathematical  properties.  However,  its  practical  applicability  seems  to  be  limited  because 
it  requires  nondegeneracy  assumptions,  and  for  highly  sparse  problems  the  added  expense 
incurred  in  calculating  the  indicator  may  dominate  any  gain  obtained  from  its  use. 

This  paper  is  organized  as  follows.  In  Section  2,  we  define  our  indicator  and  study  its 
properties.  The  applications  of  the  indicator  to  primal,  dual  and  primal-dual  algorithms  for 
identifying  an  optimal  basis  are  developed  in  Section  3.  Section  4  deals  with  the  numerical 
computation  of  the  indicator.  The  extension  of  our  indicator  technique  to  degenerate  prob¬ 
lems  is  studied  in  Section  5.  Described  in  Section  6  are  methods  for  randomly  generating 
nondegenerate  and  primal  degenerate  problems  for  use  in  numerical  experimentations.  Nu¬ 
merical  examples  are  presented  in  Section  7  and  some  concluding  remarks  are  given  in  the 
final  section. 
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2  Definition  and  Properties  of  the  Indicator 


A  key  ingredient  in  essentially  every  interior  point  method  motivated  and  influenced  by 
Karmarkar’s  milestone  work  of  1984  [8]  is  a  matrix  of  the  form  DAT(AD2AT)~1AD,  where 
-D  is  a  diagonal  matrix  that  changes  at  every  iteration.  In  primal  algorithms  (affine  variants 
of  the  Karmarkar  algorithm,  for  example,  see  [3],  [5]  and  [16]),  we  see  D  =  diag(x).  In 
dual  algorithms  (see  Adler  et  al.  [1]  and  Monma  and  Morton  [11],  for  example),  we  see 
D  =  [diag(z)]~\  where  2  is  the  vector  of  dual  slack  variables.  And  in  primal-dual  algorithms 
(see  Ivojima  et  al.  [10],  for  example),  we  see  D 2  =  diag[x}\diag[zf\~l .  In  this  work,  we  will 
show  that  under  suitable  assumptions,  optimal  basis  information  can  be  obtained  from  the 
diagonal  of  this  matrix  DAT  [AD2  AT )  1  AD,  which  we  shall  call  the  indicator  vector  or 
simply  the  indicator. 

Now  we  give  a  formal  definition  of  the  indicator.  For  a  fixed  matrix  A  <=  RmXn(m  <  n), 
consider  the  matrix-valued  function  H  :  Rn  —>  Rnxn  defined  by 

H(d)  =  DAt(AD2At)+AD,  (2.1) 

where  d  e  R”,  D  =  diag(d)  and  the  superscript  “+”  denotes  the  generalized  inverse.  We 
will  be  primarily  interested  in  the  function  q  :  R”  -*  Rn  obtained  as  the  diagonal  of  H[d), 
i.e., 

q(d)  =  diag[H[d))  or  qi[d)  =  Ha[d),  i  =  1, 2, 3, ...,  n.  (2.2) 

This  function  q[d)  is  defined  for  all  d  €  R”;  however  it  will  not  be  continuous  at  points 
d  where  the  matrix  AD2AT  changes  rank  in  every  neighborhood  of  d.  At  points  d  where 
AD  A  has  constant  rank  in  some  neighborhood  of  d,  q[d )  will  be  infinitely  smooth. 

Ye  and  Todd  [19]  were  probably  the  first  to  observe  that  the  diagonal  elements  of  such  a 
matrix  contain  valuable  information.  In  a  primal-dual  context,  they  developed  an  interesting 
criterion  which  was  guaranteed  eventually  to  identify  the  optimal  basis  for  a  nondegenerate 
vertex.  Their  criterion  involved  several  quantities  including  the  diagonal  of  a  matrix  of  the 
form  (2.1).  However,  they  did  not  consider  the  diagonal  as  an  indicator  and  did  not  study 
the  properties  of  the  diagonal. 

As  the  first  step  towards  showing  that  q(d )  has  several  interesting  properties,  we  offer 
the  following  lemma. 

Lemma  2.1  If  q(d)  is  given  by  (2.2),  then  for  all  d  6  R”  we  have 

0  <  q(d)  <  1. 
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Moreover,  if  A  has  no  zero  columns  and  AD  has  full  rank,  then  qfd )  =  0  if  and  only  if 
di  =  0. 


Proof:  Observe  that  both  H(d)  and  I  —  H(d)  are  orthogonal  projections  satisfying  PT  =  P 
and  P 2  =  P;  they  are  therefore  positive  semi-definite  and  must  have  nonnegative  diagonals. 
This  proves  the  first  part. 

The  second  part  follows  from  the  formula 

qfd)  =  df  af(AD2AT)~1ai, 


where  a ,•  is  the  *-th  column  of  A  and  the  fact  that  the  quantity  af(AD2AT)  1a1-  >  0  under 
the  given  assumptions.  □ 

The  following  two  lemmas  are  crucial  to  the  development  of  our  theory. 

Lemma  2.2  For  d  £  Rn  let  q(d)  be  given  by  (2.2)  for  some  A  of  full  row  rank.  Consider 
the  n-dimensional  vector  d*  where  some  components  of  d*  may  be  infinite.  Assume  that  the 
components  of  d*  can  be  divided  into  two  sets:  Sa  and  Sp,  such  that 

1.  Sa  contains  m  and  Sp  contains  n  —  m  components  of  d* ; 

2.  all  elements  in  Sa  are  nonzero  and 

max{|d*|  :  d*  €  Sp} 

min{|4|  :<$€&}  ’  [  } 

3.  the  m  by  m  submatrix  of  A  consisting  of  columns  corresponding  to  the  components  in 
Sa  is  nonsingular. 

Then,  as  d  converges  to  d* 

hm  qfd)  =  qfd?)  =  j 

Proof:  Without  lose  of  generality,  we  assume  that 


if  d  j  £  Sa , 
if  d*  £  Sp. 


(2.4) 


Sa  =  {d',d* . d'J  and  Sp  =  {d‘m+ud-m+2, 


Let  Aa  and  Ap  be  the  submatrices  of  A  consisting  of  the  first  m  and  the  last  n  —  m  columns 
of  A,  respectively.  By  Assumption  3,  Aa  is  nonsingular.  Similarly,  let  Da  and  Dp  be  the 
submatrices  of  D  consisting  of  the  first  m  and  the  last  n—m .  columns  of  D,  respectively,  where 
D  =  diag(d)  and  d  converges  to  d*.  Also  let  r  =  l/min{|d,j  :  d*  £  5a}.  From  Assumption  2 
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we  know  that  r  is  well  defined  for  d  sufficiently  close  to  d*  and  all  the  diagonal  elements  of 
rDa  have  absolute  value  greater  than  or  equal  to  1.  Substituting  AD  —  [AaDa  ApDp]  into 
H(d),  we  have  for  d  close  to  d* 

H(d)  =  [rAaDa  rApD/3}T[Aa(rDa)2AaT  +  A^rDp)2  AgT)~1[rAaDa  rApDp\. 

Since  Assumption  2  implies  that  lim^-nj*  rDp  —  0,  it  is  evident  that 

lim  H(d)  =  H(d‘)=  [  “  ,  (2.5) 

d^d*  o  0 

which  proves  (2.4).  □ 

Since  d*  may  have  infinite  components,  we  will  define  the  derivatives  of  q  at  d*  by 
continuity. 

Lemma  2.3  Let  q(d)  be  given  as  in  (2.2)  and  d*  satisfy  the  conditions  in  Lemma  2.2. 
Define  the  derivatives  of  q(d)  at  the  point  d*  as  the  limits  of  corresponding  derivative  values 
at  d  6  Rn  as  d  converges  to  d* .  Then  q(d)  is  at  least  twice  continuously  differentiable  at  d* . 
Moreover,  the  Jacobian  matrix  of  q(d)  vanishes  at  d* ;  or  equivalently, 

Vqi(d*)  =  0,  i  =  1,2, ...,  n.  (2.6) 

Proof:  To  verify  the  differentiability  of  q(d)  at  d* ,  we  first  assume  that  d*  is  finite.  By 
Assumption  3  of  Lemma  2.2,  AD2 AT  is  nonsingular  at  d* .  It  is  therefore  nonsingular  in  a 

neighborhood  of  d*.  Observe  that  q(d)  is  a  rational  function  of  d  near  d*  with  a  nonzero 

denominator.  This  follows  from  the  well-known  adjoint  form  for  the  inverse  matrix  and  the 
fact  that  all  elements  of  AD2AT  are  quadratic  functions  of  d.  Therefore,  q(d)  is  actually 
infinitely  differentiable  at  finite  d*. 

Now  we  show  that  q(d)  has  continuous  second-order  partial  derivatives  even  at  an  infinite 
d*.  (Since  we  will  only  make  use  of  second  derivatives  in  our  analysis,  we  will  not  concern 
ourselves  with  derivatives  of  order  higher  than  2.)  Obviously,  our  definition  of  derivatives  at 
d*  guarantees  that  if  a  derivative  exists  at  d* ,  then  it  is  also  continuous  at  d* . 

Let  the  matrix- valued  function  P  be  defined  as 

P(d)  =  AT(AD2AT)~1A. 

From  the  definitions  of  H{d )  and  q(d),  we  have  that  for  any  finite  d  sufficiently  close  to  d* , 
Hij{d)  —  didjPij(d)  and  qfd)  -  Hu(d)  =  d2iPii(d),  (2.7) 
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where  Pij(d ),  the  (i,  j)-th  element  of  P(d),  is  given  by 

Pij(d)  =  aJ(AD2AT)~1aj, 

and  a,-  is  the  i-th  column  of  A.  By  the  Sherman-Morrison- Woodbury  formula  (see  [12,  p. 
50],  for  example), 

2  ede  +  e2 


Pij(d  +  ee*)  -  Pij(d)  = 


1  +  (2  edt  +  e*)Pu(d) 

Dividing  the  right-hand  side  by  e  and  letting  e  go  to  zero,  we  obtain 

dPiM) 


Pie(d)Pje(d). 


ddf 


-2  dePu(d)Pje(d). 


It  follows  from  (2.7)  and  (2.8)  that 

dq,{d)  _ 
dd , 


where  S/j  is  the  Kronecker  delta. 


■2d{Hij(d)Pij(d)  +  26ijdiPa(d), 


(2.8) 


(2.9) 


Let  us  consider  Sa,  S/3,  Aa,  A/ 3  and  Da ,  Dp  as  in  Lemma  2.2.  We  note  that  Assumption  2 
in  Lemma  2.2  guarantees  that  d*  /  0  for  all  d*  £  Sa.  A  direct  calculation  gives 


Jim  P(d)  =  P(d')  = 


(2.10) 


(■ DaT 2  (ZV)-M«-% 

A0TAa-T(Da*)-2  A0TAa-T(Da*)-2Aa-1A/ 

where  the  convention  l/oo  =  0  is  used.  Clearly,  P(d*)  is  finite. 

To  prove  (2.6),  we  look  at  the  following  two  different  cases. 

Case  1:  i  =  j.  If  d*  =  0,  then  the  limit  of  (2.9)  as  d  goes  to  d *  obviously  exists  and  is 
zero.  If  d*  ^  0,  then  as  d  goes  to  d *  in  (2.9)  and  using  (2.5),  we  have 

=  Itf,.,. (<(•)(!  -  H,,(d-))  =  0. 

Case  2:  i^j.  Now  we  only  have  the  first  term  in  (2.9).  Assume  that  |d*|  =  00;  otherwise 
the  proof  would  be  trivial  because  Hij(d*)  =  0.  Now  we  have  d*  £  Sa,  be.,  0  <  i  <  m.  If 
d*  7^  0,  then  from  the  first  equation  in  (2.7) 

dqpd)  2 


ddj  d. 


{HiMY 


which  has  a  zero  limit  at  d*  by  (2.5).  If  d*  =  0,  then  d*  £  Sp,  i.e.,  m  +  1  <  j  <  n.  Let 
d  — *  d*,  then  from  (2.10)  and  (2.5)  we  have 


%(<**)  =  2ifp(d*)[Aa-1A/?],„ 

ddj  d* 


J— 771 


=  0. 
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So  far  we  have  proved  that  the  Jacobian  matrix  of  q(d)  exists  and  is  zero  at  d* .  A  direct 
calculation  shows  that  for  d  close  but  not  equal  to  d *, 

=  8Hte(d)II3Ad)PtAd)  -  ASjtHie(d)Pit(d)  -  ASi(Hjc(d)Pje{d)  (2.11) 
-  28ij(dtPa(d))2  +  2  8ieSjtPu(d). 

The  second-order  partial  derivatives  in  (2.11)  can  be  shown  to  have  finite  limits  as  d  — >  d* . 
This  completes  the  proof.  □ 

It  should  not  be  a  surprise  that  the  Jacobian  matrix  of  q(d)  is  zero  at  d* .  According  to 
Lemma  2.2  either  the  maximum  (qfd*)  =  1)  or  the  minimum  (qi(d*)  =  0)  is  reached  at  d* 
for  every  component  of  q(d). 

3  Applications  of  the  Indicator 

In  this  section,  we  show  how  the  indicator  developed  in  the  previous  section  can  be  used  in 
a  primal,  dual  or  primal-dual  interior  point  algorithm  to  identify  an  optimal  basis  (under 
appropriate  nondegeneracy  assumptions). 

Primal  Algorithms 

Our  first  result  concerns  primal  algorithms. 

Theorem  3.1  Let  {ar}  C  Rn  converge  to  a  nondegenerate  vertex  ( basic  feasible  solution) 
x*  of  the  linear  program  (1.1).  If  qk  —  q(xk )  is  given  by  (2.2),  then 

lim  qk  =  q*  =  sign(x*).  (3.1) 

k  — ►  oo 

Moreover, 

\\qk  -  q*\\  =  0(\\xk  -  X*\\>). 

Proof:  Let  Sa  in  Lemma  2.2  contain  the  nonzero  components  of  x*  and  let  Sp  contain  the 
zero  components.  Then  obviously  Assumptions  1  and  2  of  Lemma  2.2  hold.  The  fact  that 
x*  is  a  nondegenerate  basic  feasible  solution  guarantees  that  Assumption  3  is  also  satisfied. 
The  first  part  of  the  theorem  now  follows  from  Lemma  2.2. 

The  second  part  follows  directly  from  the  fact  proved  in  Lemma  2.3  that  the  Jacobian 
matrix  of  q(x)  is  zero  at  x*  and  q(x)  is  twice  continuously  differentiable.  □ 

The  theoretical  advantages  of  using  the  qf  s  as  indicators,  as  opposed  to  the  variables 
xfs  themselves,  are  twofold.  First,  the  qfs  provide  a  fixed  separation  which  is  problem 
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independent.  This  is  not  true  of  the  xfs  for  which  the  separation  between  zero  and  nonzero 
variables  can  be  arbitrarily  small.  Second,  the  qfs  converge  quadratically  faster  than  the 
Xj’s.  Hence  an  earlier  termination  is  guaranteed. 

In  our  numerical  experiments,  if  high  accuracy  is  required,  then  usually  in  less  than 
half  of  the  number  of  iterations  needed  for  convergence,  a  clear  0-1  pattern  will  appear.  A 
plausible  explanation  for  this  phenomenon  follows.  If  {aA}  converges  to  x*  at  an  i?-linear 
rate  (which  is  expected  and  observed  in  practice  for  most  primal  interior  point  algorithms 
on  nondegenerate  problems),  then  there  exist  positive  constant  Cx  and  r  <  1,  such  that  for 
k  large 

\\xk-x*\\<Cxrk. 

From  Theorem  3.1,  there  exists  some  constant  Cq,  such  that 

Ik*  -  9i  <  cyk. 

For  a  given  small  positive  number  e  -C  1,  it  will  take  approximately 

In  e  —  In  Cx  In  t 

_  ^  _ 

lnr  lnr 

steps  for  ||xfc  —  x*||  <  e  to  be  satisfied,  while  only  about  half  of  that  number  of  iterations  is 
needed  for  the  satisfaction  of  ||g,fc  —  g*||  <  e. 

In  the  context  of  a  primal  algorithm  Barnes  [3]  used  a  matrix,  not  exactly  of  the  form  of 
(2.1)  but  quite  similar,  to  construct  estimates  of  the  Lagrange  multipliers  associated  with  the 
constraints  Ax  =  b.  He  demonstrated  that  these  estimates  converge  to  the  true  multipliers 
quadratically  faster  than  the  the  nonbasic  variables  converge  to  zero.  While  this  result  is  not 
directly  related  to  our  result,  it  does  have  a  similar  flavor.  Indeed,  Barnes  suggested  using 
these  multiplier  estimates  to  identify  an  optimal  basis. 


Dual  Algorithms 

In  the  dual  affine  algorithms  developed  by  a  number  of  authors  as  variants  of  Karmarkar’s 
algorithm  (see  Adler  et  al.  [1]  and  Monma  and  Morton  [11],  for  example),  the  matrix  H(d) 
in  (2.1)  appears  with  di  =  1/z,-.  For  these  dual  algorithms,  we  have  the  following  result, 
which  is  analogous  to  its  primal  counterpart  Theorem  3.1. 

Theorem  3.2  Let  {zk}  C  Rn  converge  to  a  dual  slack  vector  z*  associated  with  a  non¬ 
degenerate  vertex  of  the  dual  linear  program  (1.2).  If  qk  =  q(dk )  is  given  by  (2.2)  with 
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dk  =  1  /zt 


i  —  1, 2, n,  then 


0,  if  z*  >  0, 
1,  ifz*  =  0. 


Moreover, 

\\qk-q*\\  =  o(\\zk-z*\n 


Proof:  The  proof  given  below  was  suggested  by  an  anonymous  referee.  It  is  considerably 
shorter  than  our  original  proof. 

Let  B  €  R(«-™)xn  havg  fup  row  rank  and  be  such  that  its  rows  are  orthogonal  to  those 
of  A.  For  any  positive  diagonal  matrix  D ,  the  rows  of  BD -1  are  othogonal  to  those  of  AD. 
It  is  straightforward  to  verify  that 

H{d)  =  DAT(AD2AT)~lAD  =  1-  ZBt{BZ2Bt)~1BZ, 
where  Z  =  diag(z)  =  D~l .  Let 

g{z)  =  diag(ZBT(BZ2BT)~1  BZ). 


It  can  be  shown  that  the  nondegeneracy  assumption  implies  that  z*  has  n  —  m  nonzero 
components  and  the  corresponding  n  —  m  columns  of  B  are  linearly  independent.  Applying 
Lemmas  2.2  and  2.3  to  g(z ),  we  have 


lim  g\  =  g*  = 


1,  if  z*  >  0, 
0,  if  z*  =  0. 


and 

\\gk  ~  5*11  =  0(\\zk  —  z*\\2). 


Since  gi(d)  =  1  —  gi(z),  the  conclusions  of  the  theorem  follow  immediately.  □ 


Primal-Dual  Algorithms 

Our  technique  is  also  applicable  to  primal-dual  algorithms  (see  Kojima  et  al.  [10],  for  ex¬ 
ample)  where  the  matrix  H(dk)  in  (2.1)  appears  with  dk  —  \J xk / zf  and  xk  and  zk  are  kept 
strictly  positive  for  all  k.  Since  both  the  primal  and  the  dual  programs  are  involved,  different 
results  can  be  obtained  by  using  different  combinations  of  assumptions  (primal  nondegener¬ 
acy,  dual  nondegeneracy,  strict  complementarity).  Because  these  results  and  their  proofs  are 
all  very  similar,  we  choose  only  to  present  one  result  that  is  based  on  primal  nondegeneracy. 
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Theorem  3.3  Let  {xk}  C  Rn  and  {zfc}  C  R"  be  positive  and  converge  to  an  optimal  solution 
x*  of  the  primal  program  (1.1)  and  an  optimal  dual  slack  vector  z*  of  the  dual  program  (1.2), 
respectively.  Assume  that  (i)  x*  is  a  nondegenerate  vertex  of  (1.1);  and  (ii)  there  are  positive 
constants  a;  and  /3{  such  that  for  k  large 


aizi  <  xk  <  fczl  if  x*  =  0  and  z*  =  0. 

Let  qk  =  q(dk)  be  given  by  (2.2)  with  dk  =  yfxfjzf,  i  =  1,2,  ...,n.  Then  qk  converges  to  a 
0-1  vector  with  m  ones  and  n  —  m  zeros,  and 


lim  qk  =  q*  = 

K—+00 


0,  if  x*i  =  0, 
1,  if  x*  >  0. 


Proof:  Define 

d*  =  lim  sup  \jxk!zk. 

k—KDO 

Recall  that  x*  is  a  nondegenerate  vertex  of  (1.1).  There  are  m  nonzero  components  and 
n  —  m  zero  components  in  x*.  Let  Sa  in  Lemma  2.2  contain  those  d*' s  corresponding  to 
the  nonzero  components  of  x*  and  Sp  contain  those  d*’s  corresponding  to  the  zero  a;*’s.  By 
complementarity  x*z*  =  0,  so  we  have  d*  =  oo  for  d*  e  Sa.  On  the  other  hand,  d*  =  0 
for  d*  €  Sp  and  z*  >  0.  In  addition,  d*  <  oo  for  d*  €  Sp  and  z*  =  0  by  Assumption  (ii). 
Therefore,  Sa  and  Sp  satisfy  Assumptions  1  and  2  of  Lemma  2.2.  The  assumption  that  x* 
is  a  vertex  guarantees  that  Assumption  3  of  Lemma  2.2  is  also  satisfied.  It  is  not  difficult 
to  see  from  the  proof  of  Lemma  2.2  that  as  long  as  the  three  assumptions  in  Lemma  2.2  are 
satisfied,  we  still  have  that  q(dk)  converges  to  the  0-1  vector  with  m  ones  and  n  —  m  zeros  as 
given  in  (2.4)  even  though  we  have  d*  =  lim  sup  d\  instead  of  d*  =  lim  dk  for  some  d*  €  Sp. 
It  also  follows  that  q*  =  1  whenever  x*  >  0.  This  completes  the  proof  of  the  theorem.  □ 
The  second  assumption  in  the  above  theorem  basically  assumes  that  if  both  X{  and  Z{ 
converge  to  zero,  then  they  do  so  at  the  same  rate.  This  assumption  seems  to  be  quite 
reasonable. 


Basis  Identification  Criterion 

Our  optimal  basis  identification  criterion  based  on  the  previous  theorems  is  defined  as  follows: 
Given  a  small  positive  number  t,  if 

{i  ■  qi{dk)  >  1  -  e,  1  <  *  <  n}  \J{i  :  qt(dk)  <  e,  1  <  i  <  n}  =  {1, 2, ...,  n},  (3.2) 

then  test  to  see  if  an  optimal  basis  has  been  identified. 
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If  an  orthonormal  basis  U(d)  of  DAT  (i.e.,  of  the  column  space  of  DA° )  is  computed  by 
a  QR  method  or  an  SVD  method,  then  obviously  H(d)  =  U(d)U(d)T  and 

qt{d)  =  Ui(d)TUi(d),  i  =  1,2, 

where  u,(d)  is  the  Tth  row  of  U(d).  In  this  case,  the  cost  of  computing  the  indicator  vector 
is  0(mn)  for  dense  matrices. 

If  a  Cholesky  factorization  AD2AT  =  L(d)L(d)T  is  computed  (as  is  done  currently  in 
most  implementations  of  variants  of  the  Karmarkar  algorithm)  instead  of  a  QR  factorization 
of  DAt ,  then  to  obtain  an  orthonormal  basis  U(d)  for  DAT,  one  needs  to  solve  the  lower 
triangular  system  L(d)U(dy  =  AD  which  requires  0(m2n )  operations  for  dense  matrices. 
As  we  can  see,  this  cost  is  one  order  of  m  higher  than  what  is  required  when  an  orthonormal 
basis  of  DAt  is  available. 

4  Extension  to  Degenerate  Problems 

In  general,  the  indicator  as  defined  in  (2.2)  is  not  directly  applicable  to  degenerate  problems. 
However,  we  will  show  in  this  section  that  for  primal  interior  point  algorithms  (i.e.,  d  =  x) 
a  modified  primal  indicator  can  be  devised  to  handle  problems  with  only  primal  degeneracy, 
or  more  precisely,  primal  problems  that  have  a  unique  solution.  We  will  focus  on  the  primal 
indicator  for  primal  algorithms  only,  though  the  results  obtained  also  apply  to  the  dual 
indicator  for  dual  algorithms. 

We  consider  a  sequence  {aA}  that  converges  to  a  degenerate  basic  feasible  solution  x* 
with  r  nonzero  components,  where  r  <m.  We  first  assume  that  the  first  r  rows  of  AX*  are 
linearly  independent,  i.e.,  ArX*  has  full  row  rank,  where  Ar  is  the  matrix  that  consists  of 
the  first  r  rows  of  A.  Then  the  diagonal  of  XAj(ArX2Aj)+ArX,  which  we  will  call  <?(r)(z), 
will  tend  to  sign(x*)  and  can  be  used  as  the  indicator.  It  is  fortunate  that  one  need  not 
know  the  number  r  in  advance  and  the  indicator  q(r\x)  is  readily  available  if  an  orthonormal 
basis  U(x)  of  X AT  is  computed  (by  a  QR  or  an  SVD  method,  for  example).  One  only  need 
look  through  q^\x)  for  j  =  1,2,  ...,m  to  search  for  a  0-1  pattern,  where 

=  9p_1)(®)  +  (U,j(x))2  and  g,'0)(x)  =  0,  i  =  1,2 (4.1) 
The  cost  is  still  0(mn )  for  dense  matrices. 

The  assumption  that  the  first  r  rows  of  AX*  are  linearly  independent  can  be  removed 
in  the  following  way.  Let  Xk AT Pk  =  UkRk ,  where  Uk  is  an  orthonormal  basis  of  XkAT, 
Rk  is  upper  triangular  and  Pk  is  a  permutation  matrix  which  forces  the  diagonal  elements 
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of  Rk  to  appear  in  descending  order  by  absolute  value  (this  can  be  done  during  the  QR 
decomposition).  Then  as  xk  converges  to  x*,  the  last  m  —  r  diagonal  elements  of  Rk  will 
tend  to  zero  and  the  first  r  rows  of  PkAXk  will  be  linearly  independent  for  all  sufficiently 
large  k.  If  we  denote  Uk  as  the  first  r  columns  of  Uk ,  then 

lim  q^T\xk)  =  lim  diag(UkUk  T)  =  sign(x*).  (4-2) 

k — ►oo  k — kx) 

The  indicator  q^(xk)  can  be  computed  recursively  using  (4.1). 

We  state  the  above  discussion  formally  as  a  theorem. 

Theorem  4.1  Let  C  R”  converge  to  a  degenerate  basic  feasible  solution  x*  of  the 

linear  program  (1.1)  with  r  nonero  components  where  r  <  m.  Let  Ar  be  an  r  x  n  submatrix 
of  A  such  that  ArX*  has  rank  r.  If  qk  =  q(xk)  is  given  by  (2.2)  with  A  replaced  by  Ar,  then 

lim  qk  =  q*  =  sign(x*). 

k—*oo 

Moreover, 

Ik'1  -  «1  =  0(||x*  -  x-f). 

Proof:  The  proof  is  essentially  the  same  as  that  of  Theorem  3.1,  so  we  omit  it.  □ 

It  is  worth  observing  from  (4.1)  that  once  q\3o\x*)  —  1  for  some  jo  <  m,  then  q\J\x*)  =  1 
for  all  jo  <  j  <  m.  This  is  so  because  of  the  monotonicity  of  q\3\x)  with  respect  to  j  and 
the  fact  that  q(x)  <  1. 

It  is  unfortunate  that  for  problems  with  both  primal  and  dual  degeneracy,  our  indicator 
is  incapable  of  identifying  all  the  zero  and  nonzero  variables  as  xk  — >  x*  (or  zk  — ►  z*)  because 
some  components  of  the  indicator  vector  may  not  have  limits  at  optimality.  This  drawback 
undoubtedly  limits  the  practical  usefulness  of  the  indicator  because  most  real-world  problems 
do  have  both  primal  and  dual  degeneracies. 

5  Numerical  Behavior  of  the  Indicator 

To  corroborate  our  theory,  we  have  performed  some  numerical  experiments  to  explore  the 
numerical  behavior  of  the  indicator.  In  our  experiments  we  used  randomly  generated  prob¬ 
lems,  fully  aware  that  they  are  by  no  means  representative  of  real-world  problems.  We  stress 
that  the  numerical  results  have  not  demonstrated  the  effectiveness  of  our  indicator  approach, 
but  have  corroborated  our  theoretical  convergence  results. 

Both  nondegenerate  problems  and  degenerate  problems  with  only  primal  degeneracy  are 
constructed.  The  methods  of  construction  are  described  below. 
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To  generate  the  cost  vector  c  and  the  first  m  —  1  rows  of  the  constraint  matrix  A,  we 
use  the  Matlab  M-file  “rand”  to  obtain  uniformly  distributed  random  numbers  in  the  unit 
interval  (0,1)  and  then  use  the  tangent  function  tan{'K{x  —  1/2))  to  map  (0,1)  onto  the 
entire  real  line  (— oo,+oo).  For  the  m-th  row  of  A,  we  apply  the  mapping  tan( 7rx/2))  to 
make  all  the  elements  of  that  row  strictly  positive  so  that  the  feasible  set  will  be  bounded. 
The  strictly  positive  initial  feasible  point  x0  is  also  obtained  this  way.  Caution  is  taken  to 
ensure  that  the  generated  A  matrices  are  always  of  full  row  rank.  The  right-hand  side  vector 
b  is  set  equal  to  Ax0.  After  A,  b,  c  and  x0  are  generated,  we  replace  A  by  AX0,  c  by  X0c  and 
x0  by  e  —  the  vector  of  all  ones,  where  Xo  =  diag(x0).  These  replacements  are  equivalent 
to  applying  an  affine  transformation  to  the  problem  so  that  e  is  feasible  for  the  transformed 
problem.  Although  there  is  no  guarantee  in  theory  that  random  problems  generated  in  this 
manner  should  be  nondegenerate,  in  practice  the  chances  of  getting  degenerate  problems 
seem  to  be  extremely  small. 

To  construct  primal  degenerate  problems,  we  first  solve  a  nondegenerate  problem  and 
obtain  the  solution  x*.  Then  we  generate  an  i  by  n  random  matrix  B  for  1  <  l  <  n  —  m 
and  redefine  the  data  A,  b  and  c  by 


a.ja  ° 

[  B  B(x *  —  e)  J  ’ 


(5.1) 


We  also  extend  x°  =  e  to  the  in  +  l)-dimensional  vector  of  all  ones  which  is  feasible  for 
the  new  problem.  The  solution  to  the  new  problem  is  (x*T,0)T  which  is  obviously  primal 
degenerate  because  there  are  still  m  nonzeros  in  the  solution  but  there  are  now  m  +  C 
constraints.  Also  we  know  that  the  first  m  rows  of  AX*  (with  new  A  and  X*)  are  linearly 
independent.  Obviously,  the  problem  has  a  unique  solution. 

In  our  tests,  we  implemented  the  so-called  standard  form  variant  of  the  Karmarkar  pro¬ 
jective  algorithm,  a  primal  algorithm,  developed  independently  by  a  number  of  authors.  In 
this  implementation  we  use  the  procedure  suggested  by  de  Ghellinck  and  Vial  [4]  and  by 
Ye  and  Kojima  [18]  (both  based  on  an  earlier  work  of  Todd  and  Burrell  [14])  which  uses 
duality  to  construct  and  update  lower  bounds  rj  for  the  objective  function  so  the  duality 
gap  is  minimized.  In  our  numerical  experiments,  we  set  the  initial  lower  bound  t/°  to  — 106, 
which  happens  to  be  adequate  for  our  experiments.  We  have  observed  that  the  performance 
of  the  algorithm  is  not  sensitive  to  the  values  of  the  initial  lower  bound. 

Instead  of  trying  to  minimize  the  Karmarkar  potential  function  in  the  search  direction  at 
each  iteration  by  a  line  search,  we  used  a  simple  back-tracking  technique  to  ensure  that  the 
potential  function  was  reduced  by  a  fixed  amount  at  each  iteration.  The  initial  step  length 
was  set  to  0.95  times  the  step  length  that  takes  the  iterate  to  the  boundary  of  the  simplex. 
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The  program  was  written  in  Matlab  and  run  on  a  SUN  workstation  network  at  Rice 
University  with  a  machine  epsilon  of  about  2.22  x  10-16. 

The  stopping  criterion  used  in  our  tests  was  that  the  duality  gap  be  less  than  10~8,  i.e., 

cTxk  —  r]k  <  10-8.  (5.2) 

We  tested  the  optimal  basis  identification  criterion  (3.2)  for  e  =  0.1  on  10  randomly 
generated  nondegenerate  problems  with  n  ranging  from  20  to  200.  We  also  tested  the 
modified  identification  procedure  using  the  recursive  search  as  prescribed  by  (4.1)  for  10 
randomly  generated  primal  degenerate  problems.  The  numerical  results  are  included  in  the 
following  two  tables.  In  the  tables,  the  iteration  numbers  at  which  an  optimal  basis  is 
identified  and  the  stopping  criterion  is  satisfied,  respectively,  are  given  in  the  fourth  and 
sixth  columns.  The  corresponding  duality  gaps  are  listed  in  the  fifth  and  seventh  columns, 
respectively.  The  rest  of  the  tables  are  self-explanatory.  We  also  include  several  figures 
which  illustrate  the  fast  convergence  of  the  indicator.  Each  curve  in  the  figures  represents 
the  history  of  a  component  of  the  indicator  during  the  iterative  process.  As  one  can  see  from 
these  graphs,  the  convergence  of  the  indicator  vectors  (i.e.,  the  separation  of  the  two  groups 
of  indicator  components  converging  to  either  zero  or  one)  is  indeed  much  faster  than  that 
of  the  iterates.  From  Tables  1  and  2,  we  see  savings  of  about  30%  to  70%  in  the  number  of 
iterations. 

Although  these  numerical  experiments  have  confirmed  our  convergence  analysis,  the  com¬ 
putational  efficiency  of  the  indicator  approach  depends  on  how  effectively  the  indicator  can 
be  calculated  and  used.  For  dense  matrices,  the  cost  of  computing  the  indicator,  given  a 
Cholesky  factor,  is  comparable  with  the  cost  of  forming  AD2  A1 ,  which  is  acceptable.  How¬ 
ever,  recently  David  Gay  [7]  demonstrated  that  for  sparse  problems,  the  computation  of  the 
indicator  was  the  dominant  work  in  an  iteration  and  in  some  cases  this  cost  was  prohibitively 
high.  His  results  suggest  that  despite  its  fast  convergence,  the  indicator  may  not  lend  it¬ 
self  to  efficient  implementations  for  solving  sparse  problems  in  the  framework  of  Cholesky 
factorization. 
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number  of  iterations 


History  of  the  indicator 


number  of  iterations 


History  of  the  indicator 


Table  1:  Numerical  results  for  nondegenerate  problems 


Problem 

number 

variables 

n 

constraints 

m 

basis  identified 

iter.  gap 

algorithm  stopped 
iter.  gap 

Iterations 

saved 

1 

20 

10 

8 

.20E+2 

24 

.42E-S 

67% 

2 

40 

20 

13 

.23E+1 

27 

.94E-8 

52% 

3 

60 

30 

14 

.46E+2 

29 

.35E-8 

52% 

4 

80 

40 

16 

.85E+0 

30 

.32E-8 

47% 

5 

100 

50 

13 

.12E+2 

29 

.63E-8 

55% 

6 

120 

60 

19 

.66E+0 

33 

.26E-8 

42% 

7 

140 

70 

18 

.18E+0 

30 

.95E-8 

40% 

8 

160 

80 

22 

.69E-1 

34 

.83E-8 

35% 

9 

180 

90 

19 

.16E+0 

32 

.30E-8 

40% 

10 

200 

100 

27 

.22E-1 

38 

.55E-8 

29% 

Table  2:  Numerical  results  for  degenerate  problems 


Problem 

number 

variables 

n 

constraints 

m 

basis  identified 
iter.  gap 

algorithm  stopped 
iter.  gap 

Iterations 

saved 

1 

20 

12 

8 

,69e+l 

23 

.71e-8 

65% 

2 

40 

24 

11 

.76e+l 

27 

.34e-8 

59% 

3 

60 

36 

13 

.lOe+1 

27 

.62e-8 

52% 

4 

80 

48 

15 

.38e+0 

28 

.96e-8 

46% 

5 

100 

60 

14 

.18e+l 

29 

.52e-8 

52% 

6 

120 

72 

15 

.72e+l 

31 

.42e-8 

52% 

7 

140 

84 

16 

.13e+0 

29 

.51e-8 

45% 

8 

160 

96 

16 

.49e+l 

32 

.41e-8 

50% 

9 

108 

15 

,72e+0 

29 

.69e-8 

48% 

10 

120 

18 

.32e+l 

34 

.35e-8 

47% 
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6  Concluding  Remarks 


In  this  paper,  we  have  studied  an  indicator  for  identifying  optimal  bases  in  the  setting  of 
interior  point  linear  programming  algorithms.  This  indicator  has  the  theoretical  properties 
of  being  problem-independent  and  rapidly  convergent  for  linear  programming  problems  with 
unique  solutions.  It  is  applicable  to  primal,  dual  and  primal-dual  algorithms.  Our  randomly 
generated  numerical  examples  have  confirmed  our  theoretical  analysis,  showing  that  the  use 
of  the  indicator  can  reduce  the  number  of  iterations  by  a  large  percentage. 

From  a  theoretical  point  of  view,  we  believe  that  the  main  result  of  this  work  is  the 
establishment  of  the  convergence  properties  of  the  indicator  on  problems  without  general 
degeneracy.  However,  the  practical  applicability  of  the  indicator  to  real-world  problems  is 
severely  limited  by  two  factors.  First,  it  is  not  applicable  to  problems  with  both  primal  and 
dual  degeneracies.  Second,  the  relative  cost  of  computing  the  indicator  can  be  very  high  for 
large  sparse  problems.  Although  it  is  still  possible  that  the  method  may  find  application  in 
some  very  special  problems,  at  this  point  it  seems  to  be  mainly  of  theoretical  interest. 
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